Similar Term Discovery using Web Search
Abstract
We present an approach to the discovery of semantically similar terms that utilizes a web search engine as both a source for generating related terms and a tool for estimating the semantic similarity of terms. The system works by associating with each document in the search engine’s index a weighted term vector comprising those phrases that best describe the document’s subject matter. Related terms for a given seed phrase are generated by running the seed as a search query and mining the result vector produced by averaging the weights of terms associated with the top documents of the query result set. The degree of similarity between the seed term and each related term is then computed as the cosine of the angle between their respective result vectors. We test the effectiveness of this approach for building a term recommender system designed to help online advertisers discover additional phrases to describe their product offering. A comparison of its output with that of several alternative methods finds it to be competitive with the best known
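The pipeline described in the abstract (per-document weighted term vectors, a result vector averaged over the top documents of a query, and cosine similarity between result vectors) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `search` callable, the `top_k` parameter, and the dictionary representation of term vectors are all assumptions introduced here, and the abstract does not say how term weights are assigned or how many top documents are used.

```python
from collections import defaultdict
from math import sqrt


def result_vector(doc_vectors):
    """Average term weights over the term vectors of the top documents.

    Each document vector is a dict mapping a phrase to its weight
    (an assumed representation; the source does not specify one).
    """
    if not doc_vectors:
        return {}
    totals = defaultdict(float)
    for vec in doc_vectors:
        for term, weight in vec.items():
            totals[term] += weight
    n = len(doc_vectors)
    return {term: total / n for term, total in totals.items()}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_terms(seed, search, top_k=10):
    """Generate related terms for a seed phrase and score each one.

    `search` is a hypothetical callable: given a query string, it returns
    the term vectors of the top documents in the result set.
    """
    seed_vec = result_vector(search(seed))
    scored = []
    for term in seed_vec:
        if term == seed:
            continue
        # Run each candidate as its own query, then compare result vectors.
        term_vec = result_vector(search(term))
        scored.append((term, cosine(seed_vec, term_vec)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In this sketch, candidate terms come from the seed's result vector and each candidate is re-run as a query to obtain its own result vector, so scoring n candidates costs n additional searches. A real system would presumably read precomputed document vectors from the search engine's index.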
Similar resources
Expert Discovery: A web mining approach
Expert discovery is the search for an answer to the question: “Who is the best expert of a specific subject in a particular domain within peculiar array of parameters?” An expert with domain knowledge in a given field is crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues in the expert-finding task in real-world communities. Collabor...
A Technique for Improving Web Mining using Enhanced Genetic Algorithm
The World Wide Web is growing at a very fast pace and makes a large amount of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their results can still be refined and their accuracy is not high enough. One method for web mining is evolutionary algorithms, which search according to the user interests...
Discovery of Term Variation in Japanese Web Search Queries
In this paper we address the problem of identifying a broad range of term variations in Japanese web search queries, where these variations pose a particularly thorny problem due to the multiple character types employed in its writing system. Our method extends the techniques proposed for English spelling correction of web queries to handle a wider range of term variants including spelling mist...
Web service clustering using text mining techniques
The idea of a decentralised, self-organising service-oriented architecture seems increasingly more plausible than the traditional registry-based ones, in view of the success of the web and the reluctance to take up web service technologies. Automatically clustering Web Service Description Language (WSDL) files on the web into functionally similar homogeneous service groups can be seen as a b...
Simple Back-end Services for Corporate Semantic Web
In order to be adopted within corporate environments, Semantic Web applications must provide tangible short-/medium-term gains. Although corporate Semantic Web offers enterprises new possibilities for enhanced integration of heterogeneous business data, information discovery, and advanced automation of tasks, a cost-benefit analysis is in any case essential. In this paper, we argue that the mai...